CUDA: fix race condition in FA vector kernels #13742

JohannesGaessler · 2025-05-24T09:11:46Z

Looking at the code I added in #13584 again, I think I accidentally introduced a race condition. The mask is being written to shared memory anyway, so the synchronization between warps is achieved by each warp checking all of the mask values and then reducing `skip` within the warp. Given the same mask values, each warp will come to the same conclusion regarding whether or not to execute the `continue`. However, warps are not guaranteed to execute the `continue` at the same time. A warp that has already executed it will write new values to `maskf_shared`, which can in turn influence whether other warps, which may still be reading the old values, execute the `continue`, potentially causing the warps to become desynchronized.
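To illustrate the hazard, here is a minimal standalone sketch. It is not the kernel code touched by this PR: the kernel name, block shape, and buffer names (`count_unmasked_tiles`, `mask_shared`, `NWARPS`) are made up for the example, and the block-wide barrier before the `continue` is just one way such a window could be closed, assuming the skip decision is uniform across the block.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int WARP_SIZE = 32;
constexpr int NWARPS    = 4;                  // hypothetical block shape
constexpr int TILE      = WARP_SIZE * NWARPS; // mask values staged per iteration

// For each warp, counts how many mask tiles were NOT skipped.
__global__ void count_unmasked_tiles(const float * mask, const int ntiles, int * processed) {
    __shared__ float mask_shared[TILE];

    const int tid  = threadIdx.x;
    const int warp = tid / WARP_SIZE;
    const int lane = tid % WARP_SIZE;

    int count = 0;
    for (int t = 0; t < ntiles; ++t) {
        mask_shared[tid] = mask[t*TILE + tid];
        __syncthreads(); // make the whole tile visible to every warp

        // Every warp inspects the whole tile, so all warps reach the same decision.
        bool skip = true;
        for (int i = lane; i < TILE; i += WARP_SIZE) {
            skip = skip && isinf(mask_shared[i]);
        }

        if (__all_sync(0xFFFFFFFF, skip)) {
            // Without this barrier a fast warp could loop around and overwrite
            // mask_shared while a slower warp is still reading the old tile.
            __syncthreads();
            continue;
        }

        count++;         // stand-in for the real per-tile work
        __syncthreads(); // all reads of this tile are done before the next write
    }

    if (lane == 0) {
        processed[warp] = count;
    }
}

// Host-side check: 3 tiles, only the middle one has a finite mask value,
// so every warp should report exactly one processed tile.
bool run_demo() {
    const int ntiles = 3;
    std::vector<float> mask_host(ntiles * TILE, -INFINITY);
    mask_host[1*TILE + 5] = 0.0f;

    float * mask_dev      = nullptr;
    int   * processed_dev = nullptr;
    if (cudaMalloc(&mask_dev, mask_host.size()*sizeof(float)) != cudaSuccess) {
        return false;
    }
    if (cudaMalloc(&processed_dev, NWARPS*sizeof(int)) != cudaSuccess) {
        cudaFree(mask_dev);
        return false;
    }
    cudaMemcpy(mask_dev, mask_host.data(), mask_host.size()*sizeof(float), cudaMemcpyHostToDevice);

    count_unmasked_tiles<<<1, TILE>>>(mask_dev, ntiles, processed_dev);

    int processed_host[NWARPS] = {0};
    const cudaError_t err = cudaMemcpy(processed_host, processed_dev, sizeof(processed_host), cudaMemcpyDeviceToHost);
    cudaFree(mask_dev);
    cudaFree(processed_dev);
    if (err != cudaSuccess) {
        return false;
    }

    for (int w = 0; w < NWARPS; ++w) {
        if (processed_host[w] != 1) {
            return false;
        }
    }
    return true;
}

int main() {
    const bool ok = run_demo();
    printf("%s\n", ok ? "all warps agree" : "mismatch or CUDA error");
    return ok ? 0 : 1;
}
```

Because every warp takes the same branch, each warp hits the same number of barriers per iteration, which is what makes calling `__syncthreads()` inside the conditional safe in this sketch.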

ggerganov · 2025-05-24T09:55:05Z

We should consider adding test-backend-ops tests that exercise this masking logic.

This reverts commit ffd0eae.

CUDA: fix race condition in FA vector kernels

51b597b

github-actions bot added the Nvidia GPU and ggml labels May 24, 2025

JohannesGaessler mentioned this pull request May 24, 2025

Misc. bug: Eval bug: Repetitive Output After Certain Token Count When Using -np > 1 in llama.cpp (Ver. b5468) #13733

Closed

ggerganov approved these changes May 24, 2025

JohannesGaessler merged commit ffd0eae into ggml-org:master May 24, 2025
42 checks passed

Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request May 25, 2025

Revert "CUDA: fix race condition in FA vector kernels (ggml-org#13742)"

d676ef9

This reverts commit ffd0eae.
